eeg recording
EEG-Bench: A Benchmark for EEG Foundation Models in Clinical Applications
Kastrati, Ard, Bürki, Josua, Lauer, Jonas, Xuan, Cheng, Iaquinto, Raffaele, Wattenhofer, Roger
We introduce a unified benchmarking framework focused on evaluating EEG-based foundation models in clinical applications. The benchmark spans 11 well-defined diagnostic tasks across 14 publicly available EEG datasets, including epilepsy, schizophrenia, Parkinson's disease, OCD, and mild traumatic brain injury. It features minimal preprocessing, standardized evaluation protocols, and enables side-by-side comparisons of classical baselines and modern foundation models. Our results show that while foundation models achieve strong performance in certain settings, simpler models often remain competitive, particularly under clinical distribution shifts. To facilitate reproducibility and adoption, we release all prepared data and code in an accessible and extensible format.
SAMBA: Toward a Long-Context EEG Foundation Model via Spatial Embedding and Differential Mamba
Hong, Jiazhen, Mackellar, Geoffrey, Ghane, Soheila
Long-sequence electroencephalogram (EEG) modeling is essential for developing generalizable EEG representation models. This need arises from the high sampling rate of EEG data and the long recording durations required to capture extended neurological patterns in brain activity. Transformer-based models have shown promise in modeling short sequences of a few seconds; however, their quadratic complexity limits scalability to longer contexts. Moreover, variability in electrode montage across available datasets, along with inter-subject differences in brain signals, poses significant challenges to developing a generalizable and robust foundation model. We propose SAMBA, a self-supervised learning framework with a Mamba-based U-shaped encoder-decoder architecture, which effectively captures long-range temporal dependencies and spatial variability in EEG data. Leveraging the inherent ability of Mamba to process long contexts, we introduce: (1) Temporal Semantic Random Masking for semantic-level sequence reconstruction, (2) a Multi-Head Differential Mamba module to suppress redundancy and emphasize salient temporal structures, and (3) a Spatial-Adaptive Input Embedding that learns unified embeddings in a three-dimensional Euclidean space, enabling robustness across devices. Experiments on thirteen EEG datasets across diverse tasks, electrode configurations, and sequence durations demonstrate that SAMBA consistently outperforms state-of-the-art methods while maintaining low memory consumption and inference time. We also show that the learned spatial weight maps from our embedding module align closely with task-relevant neurophysiological regions, demonstrating the learnability and interpretability of SAMBA. These results highlight SAMBA's scalability and practical potential as a foundation model for real-time brain-computer interface applications.
THD-BAR: Topology Hierarchical Derived Brain Autoregressive Modeling for EEG Generic Representations
Yang, Wenchao, Yan, Weidong, Liu, Wenkang, Ma, Yulan, Li, Yang
Large-scale pre-trained models hold significant potential for learning universal EEG representations. However, most existing methods, particularly autoregressive (AR) frameworks, primarily rely on straightforward temporal sequencing of multi-channel EEG data, which fails to capture the rich physiological characteristics inherent to EEG signals. Moreover, their time-centered modeling approach also limits the effective representation of the dynamic spatial topology of brain activity. To address these challenges and fully exploit the potential of large-scale EEG models, we propose a novel Topology Hierarchical Derived Brain Autoregressive Modeling (THD-BAR) for EEG generic representations. The core innovation of THD-BAR lies in the introduction of the Brain Topology Hierarchy (BTH), which establishes a multi-scale spatial order for EEG channels. This hierarchical structure enables a redefinition of autoregressive learning as a "next-scale-time prediction" problem, effectively capturing both spatial and temporal dynamics. Based on BTH, we design a Topology-Hierarchical Vector Quantized-Variational Autoencoder (THVQ-VAE) for multi-scale tokenization and develop an enhanced Brain Autoregressive (BAR) module with specialized masking strategies for prediction. Through extensive large-scale pre-training on 17 datasets, followed by rigorous validation on 10 downstream datasets spanning 5 distinct tasks, THD-BAR consistently outperforms existing methods. These results highlight the superior generalization and modeling capabilities of our proposed approach.
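The tokenization step mentioned in the abstract above builds on vector quantization. The sketch below shows only the generic nearest-codebook lookup used in VQ-style tokenizers. It is not the THVQ-VAE itself: the function name `quantize`, the toy codebook, and the squared Euclidean distance are assumptions made for illustration.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry.

    Hypothetical helper: uses squared Euclidean distance, as in a
    basic VQ-VAE lookup. The multi-scale hierarchy of THD-BAR is
    not modeled here.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sqdist(v, codebook[k]))
        for v in vectors
    ]


# Toy example: two 2-D code vectors, two feature vectors to tokenize.
codebook = [[0.0, 0.0], [1.0, 1.0]]
features = [[0.1, -0.1], [0.9, 1.2]]
tokens = quantize(features, codebook)
```

Each continuous feature vector is replaced by a discrete token index, which is what allows an autoregressive model to predict EEG "tokens" in the way a language model predicts words.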
A Penny for Your Thoughts: Decoding Speech from Inexpensive Brain Signals
Auster, Quentin, Shapovalenko, Kateryna, Ma, Chuang, Sun, Demaio
We explore whether neural networks can decode brain activity into speech by mapping EEG recordings to audio representations. Using EEG data recorded as subjects listened to natural speech, we train a model with a contrastive CLIP loss to align EEG-derived embeddings with embeddings from a pre-trained transformer-based speech model. Building on the state-of-the-art EEG decoder from Meta, we introduce three architectural modifications: (i) subject-specific attention layers (+0.15% WER improvement), (ii) personalized spatial attention (+0.45%), and (iii) a dual-path RNN with attention (-1.87%). Two of the three modifications improved performance, highlighting the promise of personalized architectures for brain-to-speech decoding and applications in brain-computer interfaces.
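As a rough illustration of the contrastive CLIP loss mentioned above, the sketch below computes a symmetric InfoNCE loss over a batch of paired EEG and audio embeddings. It assumes the symmetric form from the original CLIP formulation and a temperature of 0.07; the paper may use a different variant, and the function names are hypothetical.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_loss(eeg_emb, audio_emb, temperature=0.07):
    """Symmetric contrastive loss; row i of each input is a matched pair.

    Assumption: temperature and the symmetric averaging are illustrative
    choices, not values taken from the paper.
    """
    eeg = [l2_normalize(v) for v in eeg_emb]
    aud = [l2_normalize(v) for v in audio_emb]
    n = len(eeg)
    # logits[i][j]: scaled similarity between EEG segment i and audio segment j.
    logits = [
        [sum(a * b for a, b in zip(e, s)) / temperature for s in aud]
        for e in eeg
    ]
    # EEG -> audio direction: the correct audio for row i is column i.
    loss_e2a = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Audio -> EEG direction: same computation on the transposed matrix.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_a2e = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (loss_e2a + loss_a2e)


aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each EEG embedding is closest to its own audio embedding (`aligned`) and large when the pairs are mismatched (`shuffled`), which is the signal that pulls the two embedding spaces together during training.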
BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
Xiao, Qinfan, Cui, Ziyun, Zhang, Chi, Chen, Siqi, Wu, Wen, Thwaites, Andrew, Woolgar, Alexandra, Zhou, Bowen, Zhang, Chao
Electroencephalography (EEG) and magnetoencephalography (MEG) measure neural activity non-invasively by capturing electromagnetic fields generated by dendritic currents. Although rooted in the same biophysics, EEG and MEG exhibit distinct signal patterns, further complicated by variations in sensor configurations across modalities and recording devices. Existing approaches typically rely on separate, modality- and dataset-specific models, which limits performance and cross-domain scalability. This paper proposes BrainOmni, the first brain foundation model that generalises across heterogeneous EEG and MEG recordings. To unify diverse data sources, we introduce BrainTokenizer, the first tokenizer that quantises spatiotemporal brain activity into discrete representations. Central to BrainTokenizer is a novel Sensor Encoder that encodes sensor properties such as spatial layout, orientation, and type, enabling compatibility across devices and modalities. Building upon the discrete representations, BrainOmni learns unified semantic embeddings of brain signals by self-supervised pretraining. To the best of our knowledge, it is the first foundation model to support both EEG and MEG signals, as well as the first to incorporate large-scale MEG pretraining. A total of 1,997 hours of EEG and 656 hours of MEG data are curated and standardised from publicly available sources for pretraining. Experiments show that BrainOmni outperforms both existing foundation models and state-of-the-art task-specific models on a range of downstream tasks. It also demonstrates strong generalisation to unseen EEG and MEG devices. Further analysis reveals that joint EEG-MEG (EMEG) training yields consistent improvements across both modalities. Code and checkpoints are publicly available at https://github.com/OpenTSLab/BrainOmni.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
This paper proposes using a CNN to classify rhythms from EEG recordings. A dataset with 13 subjects is analyzed. Temporal and spatiotemporal (STFT) data representations are investigated. The paper is well written, with a good review of the relevant literature.
ELASTIQ: EEG-Language Alignment with Semantic Task Instruction and Querying
Jiang, Muyun, Zhang, Shuailei, Yang, Zhenjie, Wu, Mengjun, Jiang, Weibang, Guo, Zhiwei, Zhang, Wei, Liu, Rui, Zhang, Shangen, Li, Yong, Ding, Yi, Guan, Cuntai
Recent advances in electroencephalography (EEG) foundation models, which capture transferable EEG representations, have greatly accelerated the development of brain-computer interfaces (BCI). However, existing approaches still struggle to incorporate language instructions as prior constraints for EEG representation learning, limiting their ability to leverage the semantic knowledge inherent in language to unify different labels and tasks. To address this challenge, we present ELASTIQ, a foundation model for EEG-Language Alignment with Semantic Task Instruction and Querying. ELASTIQ integrates task-aware semantic guidance to produce structured and linguistically aligned EEG embeddings, thereby enhancing decoding robustness and transferability. In the pretraining stage, we introduce a joint Spectral-Temporal Reconstruction (STR) module, which combines frequency masking as a global spectral perturbation with two complementary temporal objectives: random masking to capture contextual dependencies and causal masking to model sequential dynamics. In the instruction tuning stage, we propose the Instruction-conditioned Q-Former (IQF), a query-based cross-attention transformer that injects instruction embeddings into EEG tokens and aligns them with textual label embeddings through learnable queries. We evaluate ELASTIQ on 20 datasets spanning motor imagery, emotion recognition, steady-state visual evoked potentials, covert speech, and healthcare tasks. ELASTIQ achieves state-of-the-art performance on 14 of the 20 datasets and obtains the best average results across all five task categories. Importantly, our analyses reveal for the first time that explicit task instructions serve as semantic priors guiding EEG embeddings into coherent and linguistically grounded spaces. The code and pre-trained weights will be released.
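The three masking schemes named in the pretraining stage above can be sketched on toy data as follows. This is a minimal illustration of the general ideas (random token masking, causal attention masking, and masking a band of frequency bins), not ELASTIQ's STR module; all function names, the mask ratio, and the bin range are assumptions.

```python
import random


def random_mask(num_tokens, mask_ratio, seed=0):
    # True marks a token hidden from the model and reconstructed from context.
    rng = random.Random(seed)
    k = round(num_tokens * mask_ratio)
    masked = set(rng.sample(range(num_tokens), k))
    return [i in masked for i in range(num_tokens)]


def causal_mask(num_tokens):
    # allowed[i][j] is True when position i may attend to position j (j <= i),
    # so each token only sees the past and itself.
    return [[j <= i for j in range(num_tokens)] for i in range(num_tokens)]


def frequency_mask(spectrum, lo, hi):
    # Zero out frequency bins in [lo, hi) as a global spectral perturbation.
    return [0.0 if lo <= k < hi else v for k, v in enumerate(spectrum)]


hidden = random_mask(num_tokens=10, mask_ratio=0.3)
allowed = causal_mask(3)
perturbed = frequency_mask([1.0, 2.0, 3.0, 4.0], lo=1, hi=3)
```

Random masking forces the model to use context on both sides of a gap, the causal mask restricts it to past context only, and the frequency mask removes a spectral band across the whole window.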
EEG Foundation Challenge: From Cross-Task to Cross-Subject EEG Decoding
Aristimunha, Bruno, Truong, Dung, Guetschel, Pierre, Shirazi, Seyed Yahya, Guyon, Isabelle, Franco, Alexandre R., Milham, Michael P., Dotan, Aviv, Makeig, Scott, Gramfort, Alexandre, King, Jean-Remi, Corsi, Marie-Constance, Valdés-Sosa, Pedro A., Majumdar, Amit, Evans, Alan, Sejnowski, Terrence J, Shriki, Oren, Chevallier, Sylvain, Delorme, Arnaud
Current electroencephalogram (EEG) decoding models are typically trained on small numbers of subjects performing a single task. Here, we introduce a large-scale, code-submission-based competition comprising two challenges. First, the Transfer Challenge asks participants to build and test a model that can zero-shot decode new tasks and new subjects from their EEG data. Second, the Psychopathology factor prediction Challenge asks participants to infer subject measures of mental health from EEG data. For this, we use an unprecedented, multi-terabyte dataset of high-density EEG signals (128 channels) recorded from over 3,000 child to young adult subjects engaged in multiple active and passive tasks. We provide several tunable neural network baselines for each of these two challenges, including a simple network and demographic-based regression models. Developing models that generalise across tasks and individuals will pave the way for ML network architectures capable of adapting to EEG data collected from diverse tasks and individuals. Similarly, predicting mental health-relevant personality trait values from EEG might identify objective biomarkers useful for clinical diagnosis and design of personalised treatment for psychological conditions. Ultimately, the advances spurred by this challenge could contribute to the development of computational psychiatry and useful neurotechnology, and contribute to breakthroughs in both fundamental neuroscience and applied clinical research.
MR-EEGWaveNet: Multiresolutional EEGWaveNet for Seizure Detection from Long EEG Recordings
Hassan, Kazi Mahmudul, Zhao, Xuyang, Sugano, Hidenori, Tanaka, Toshihisa
Feature engineering for generalized seizure detection models remains a significant challenge. Recently proposed models show variable performance depending on the training data and remain ineffective at accurately distinguishing artifacts from seizure data. In this study, we propose a novel end-to-end model, "Multiresolutional EEGWaveNet (MR-EEGWaveNet)," which efficiently distinguishes seizure events from background electroencephalogram (EEG) and artifacts/noise by capturing both temporal dependencies across different time frames and spatial relationships between channels. The model has three modules: convolution, feature extraction, and predictor. The convolution module extracts features through depth-wise and spatio-temporal convolution. The feature extraction module individually reduces the feature dimension extracted from EEG segments and their sub-segments. Subsequently, the extracted features are concatenated into a single vector for classification using a fully connected classifier called the predictor module. In addition, an anomaly score-based post-classification processing technique is introduced to reduce the false-positive rates of the model. Experimental results are reported and analyzed using different parameter settings and datasets (Siena (public) and Juntendo (private)). The proposed MR-EEGWaveNet significantly outperformed the conventional non-multiresolution approach, improving the F1 scores from 0.177 to 0.336 on Siena and 0.327 to 0.488 on Juntendo, with precision gains of 15.9% and 20.62%, respectively.
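For readers comparing the reported F1 scores, the sketch below shows how F1 is computed from detection counts and how the relative improvement between two scores can be derived. The helper names and the example counts are illustrative assumptions; only the four F1 values passed to `relative_gain` come from the abstract above.

```python
def f1_from_counts(tp, fp, fn):
    # F1 is the harmonic mean of precision and recall,
    # equivalent to 2*TP / (2*TP + FP + FN).
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_gain(before, after):
    # Fractional improvement of `after` over `before`.
    return (after - before) / before


# Hypothetical counts: 2 true positives, 1 false positive, 1 false negative.
example_f1 = f1_from_counts(tp=2, fp=1, fn=1)

# Relative F1 gains implied by the scores reported in the abstract.
siena_gain = relative_gain(0.177, 0.336)
juntendo_gain = relative_gain(0.327, 0.488)
```

Because F1 penalizes false positives and false negatives symmetrically, reducing false alarms alone raises the score only through the precision term.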